AITopics | noise injection

Collaborating Authors

noise injection

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

Noisy Recurrent Neural Networks

Neural Information Processing SystemsApr-25-2026, 05:28:24 GMT

We provide a general framework for studying recurrent neural networks (RNNs) trained by injecting noise into hidden states. Specifically, we consider RNNs that can be viewed as discretizations of stochastic differential equations driven by input data. This framework allows us to study the implicit regularization effect of general noise injection schemes by deriving an approximate explicit regularizer in the small noise regime. We find that, under reasonable assumptions, this implicit regularization promotes flatter minima; it biases towards models with more stable dynamics; and, in classification tasks, it favors models with larger classification margin. Sufficient conditions for global stability are obtained, highlighting the phenomenon of stochastic stabilization, where noise injection can improve stability during training. Our theory is supported by empirical results which demonstrate that the RNNs have improved robustness with respect to various input perturbations.

artificial intelligence, machine learning, perturbation, (14 more...)

Neural Information Processing Systems

Industry: Health & Medicine > Therapeutic Area > Cardiology/Vascular Diseases (0.46)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

36ab62655fa81ce8735ce7cfdaf7c9e8-Paper.pdf

Neural Information Processing SystemsFeb-11-2026, 21:40:59 GMT

noise, probabilistic mapping, robustness, (15 more...)

Neural Information Processing Systems

Country:

Europe > France > Île-de-France > Paris > Paris (0.04)
Europe > Sweden > Stockholm > Stockholm (0.04)
Asia > Japan > Honshū > Kansai > Kyoto Prefecture > Kyoto (0.04)
(2 more...)

Genre: Research Report (0.54)

Industry: Information Technology > Security & Privacy (0.49)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Security & Privacy (0.69)

Add feedback

21912f7057935149fa58408ee8cb460e-Paper-Conference.pdf

Neural Information Processing SystemsFeb-9-2026, 09:07:06 GMT

controller, module, robot, (13 more...)

Neural Information Processing Systems

Country:

North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
Asia > Middle East > Jordan (0.04)

Genre: Research Report > Experimental Study (0.93)

Industry: Information Technology (0.46)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (0.67)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

Add feedback

Truly Deterministic Policy Optimization

Neural Information Processing SystemsDec-24-2025, 01:06:10 GMT

In this paper, we present a policy gradient method that avoids exploratory noise injection and performs policy search over the deterministic landscape, with the goal of improving learning with long horizons and non-local rewards. By avoiding noise injection all sources of estimation variance can be eliminated in systems with deterministic dynamics (up to the initial state distribution). Since deterministic policy regularization is impossible using traditional non-metric measures such as the KL divergence, we derive a Wasserstein-based quadratic model for our purposes. We state conditions on the system model under which it is possible to establish a monotonic policy improvement guarantee, propose a surrogate function for policy gradient estimation, and show that it is possible to compute exact advantage estimates if both the state transition model and the policy are deterministic. Finally, we describe two novel robotic control environments---one with non-local rewards in the frequency domain and the other with a long horizon (8000 time-steps)---for which our policy gradient method (TDPO) significantly outperforms existing methods (PPO, TRPO, DDPG, and TD3). Our implementation with all the experimental settings and a video of the physical hardware test is available at https://github.com/ehsansaleh/tdpo .

deterministic policy optimization, electronic proceedings, name change, (4 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Robots (0.60)

Add feedback

Evaluating the Sensitivity of BiLSTM Forecasting Models to Sequence Length and Input Noise

Albelali, Salma, Ahmed, Moataz

arXiv.org Artificial IntelligenceDec-9-2025

Deep learning (DL) models, a specialized class of multilayer neural networks, have become central to time-series forecasting in critical domains such as environmental monitoring and the Internet of Things (IoT). Among these, Bidirectional Long Short-Term Memory (BiLSTM) architectures are particularly effective in capturing complex temporal dependencies. However, the robustness and generalization of such models are highly sensitive to input data characteristics - an aspect that remains underexplored in existing literature. This study presents a systematic empirical analysis of two key data-centric factors: input sequence length and additive noise. To support this investigation, a modular and reproducible forecasting pipeline is developed, incorporating standardized preprocessing, sequence generation, model training, validation, and evaluation. Controlled experiments are conducted on three real-world datasets with varying sampling frequencies to assess BiLSTM performance under different input conditions. The results yield three key findings: (1) longer input sequences significantly increase the risk of overfitting and data leakage, particularly in data-constrained environments; (2) additive noise consistently degrades predictive accuracy across sampling frequencies; and (3) the simultaneous presence of both factors results in the most substantial decline in model stability. While datasets with higher observation frequencies exhibit greater robustness, they remain vulnerable when both input challenges are present. These findings highlight important limitations in current DL-based forecasting pipelines and underscore the need for data-aware design strategies. This work contributes to a deeper understanding of DL model behavior in dynamic time-series environments and provides practical insights for developing more reliable and generalizable forecasting systems.

artificial intelligence, machine learning, sequence length, (17 more...)

arXiv.org Artificial Intelligence

2512.06926

Country: Asia > Middle East > Saudi Arabia > Eastern Province > Dhahran (0.14)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)

Industry: Information Technology > Security & Privacy (0.68)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

VEDA: 3D Molecular Generation via Variance-Exploding Diffusion with Annealing

Zhang, Peining, Bi, Jinbo, Song, Minghu

arXiv.org Artificial IntelligenceDec-5-2025

Diffusion models show promise for 3D molecular generation, but face a fundamental trade-off between sampling efficiency and conformational accuracy. While flow-based models are fast, they often produce geometrically inaccurate structures, as they have difficulty capturing the multimodal distributions of molecular conformations. In contrast, denoising diffusion models are more accurate but suffer from slow sampling, a limitation attributed to sub-optimal integration between diffusion dynamics and SE(3)-equivariant architectures. To address this, we propose VEDA, a unified SE(3)-equivariant framework that combines variance-exploding diffusion with annealing to efficiently generate conformationally accurate 3D molecular structures. Specifically, our key technical contributions include: (1) a VE schedule that enables noise injection functionally analogous to simulated annealing, improving 3D accuracy and reducing relaxation energy; (2) a novel preconditioning scheme that reconciles the coordinate-predicting nature of SE(3)-equivariant networks with a residual-based diffusion objective, and (3) a new arcsin-based scheduler that concentrates sampling in critical intervals of the logarithmic signal-to-noise ratio. On the QM9 and GEOM-DRUGS datasets, VEDA matches the sampling efficiency of flow-based models, achieving state-of-the-art valency stability and validity with only 100 sampling steps. More importantly, VEDA's generated structures are remarkably stable, as measured by their relaxation energy during GFN2-xTB optimization. The median energy change is only 1.72 kcal/mol, significantly lower than the 32.3 kcal/mol from its architectural baseline, SemlaFlow. Our framework demonstrates that principled integration of VE diffusion with SE(3)-equivariant architectures can achieve both high chemical accuracy and computational efficiency.

artificial intelligence, machine learning, molecule, (18 more...)

arXiv.org Artificial Intelligence

2511.09568

Country: North America > United States > Connecticut (0.28)

Genre: Research Report > New Finding (0.68)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.46)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (0.46)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.34)

Add feedback

Synthetic Data Generation and Differential Privacy using Tensor Networks' Matrix Product States (MPS)

R., Alejandro Moreno, Fentaw, Desale, Palmer, Samuel, de Padua, Raúl Salles, Dixit, Ninad, Mugel, Samuel, Orús, Roman, Radons, Manuel, Menter, Josef, Abedi, Ali

arXiv.org Artificial IntelligenceNov-25-2025

Synthetic data generation is a key technique in modern artificial intelligence, addressing data scarcity, privacy constraints, and the need for diverse datasets in training robust models. In this work, we propose a method for generating privacy-preserving high-quality synthetic tabular data using Tensor Networks, specifically Matrix Product States (MPS). We benchmark the MPS-based generative model against state-of-the-art models such as CTGAN, VAE, and PrivBayes, focusing on both fidelity and privacy-preserving capabilities. To ensure differential privacy (DP), we integrate noise injection and gradient clipping during training, enabling privacy guarantees via Rényi Differential Privacy accounting. Across multiple metrics analyzing data fidelity and downstream machine learning task performance, our results show that MPS outperforms classical models, particularly under strict privacy constraints. This work highlights MPS as a promising tool for privacy-aware synthetic data generation. By combining the expressive power of tensor network representations with formal privacy mechanisms, the proposed approach offers an interpretable and scalable alternative for secure data sharing. Its structured design facilitates integration into sensitive domains where both data quality and confidentiality are critical.

artificial intelligence, machine learning, noise injection, (14 more...)

arXiv.org Artificial Intelligence

2508.06251

Country:

Europe (0.28)
North America > United States (0.14)

Genre: Research Report > New Finding (0.86)

Industry: Information Technology > Security & Privacy (1.00)

Technology:

Information Technology > Data Science (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.68)

Add feedback

AT Proofs

Neural Information Processing SystemsNov-15-2025, 06:17:28 GMT

A.1 Proof of Proposition 1 Proof of Proposition 1. Recall that h denotes the vanilla activations of the network, those we obtain with no noise injection. Let us not inject noise in the final, predictive, layer of our network such that the noise on this layer is accumulated from the noising of previous layers. Let us first consider the Taylor series expansion of the loss function with the accumulated noise defined in Proposition 1. Denoting =[ This can be deduced from the slightly opaque Fa ` a di Bruno's formula, which states that for multivariate derivatives of a composition of functions f: R The final equality comes from the moments of a mean 0 Gaussian, where j takes the values of the multi-index. Though these equalities can already offer insight into the regularising mechanisms of GNIs, they are not easy to work with and will often be computationally intractable. We will include these terms in our remainder term C .

baseline, noise, regulariser, (13 more...)

Neural Information Processing Systems

Technology: